Robust Regression
Abstract
1. Introduction
Linear regression analysis is one of the most important statistical tools in many fields. Nearly all regression analysis relies on the method of least squares to estimate the parameters of the model. A problem often encountered when applying regression is the presence of one or more outliers in the data. Outliers can arise from anything from a simple operational mistake to the inclusion of a small sample from a different population, and they can seriously affect statistical inference. Even one outlying observation can destroy least squares estimation, resulting in parameter estimates that do not provide useful information for the majority of the data. Robust regression analyses have been developed as an improvement on least squares estimation in the presence of outliers, and to provide information about which observations are valid and whether an observation should be discarded. The primary purpose of robust regression analysis is to fit a model that represents the information in the majority of the data. Three properties are used to measure the performance of a robust technique in a theoretical sense: efficiency, breakdown point, and bounded influence. Efficiency describes how well a robust technique performs relative to least squares on clean data (without outliers); high efficiency is generally desirable in an estimator. The breakdown point measures the stability of the estimator when the sample contains a large fraction of outliers (Hampel, 1975). It is the minimum fraction of outliers that can produce an infinite bias, and in this sense it is a measure of global robustness. For example, least squares has a breakdown point of 1/n, which means that a single outlier can make the estimates useless. In contrast, some robust regression estimators attain a breakdown point of approximately 50%, which is called a high breakdown point.
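The 1/n breakdown point of least squares can be illustrated with a small numerical experiment. The following is a minimal sketch and not part of the original abstract: the synthetic data, the random seed, and the helper name `ols_fit` are assumptions chosen for illustration. It fits a line to clean data, then moves a single response value further and further away and records how the fitted slope follows it.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical clean data: y = 1 + 2x plus small Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)

clean_slope = ols_fit(x, y)[1]

# Corrupt only the last observation, pushing it progressively further away.
slopes = []
for shift in (1e1, 1e3, 1e5):
    y_bad = y.copy()
    y_bad[-1] += shift
    slopes.append(ols_fit(x, y_bad)[1])

print("clean slope:", clean_slope)
print("slopes with one outlier:", slopes)
```

Because the least squares slope is a linear function of the responses, the distortion grows in proportion to the size of the single corrupted value, so no finite bound on the bias exists.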
Lastly, bounded influence is designed to counter the tendency of least squares to let points that are extreme in X-space (high-leverage points) exert greater influence, which is especially important if these points are outliers. Robust regression estimators were first introduced by Huber (1973, 1981) and are well known as M-regression estimators. Rousseeuw (1984) introduced the least median of squares (LMS) and the least trimmed squares (LTS) estimators, which minimize the median and the trimmed mean of the squared residuals, respectively. Both are high breakdown point estimators. The high breakdown point estimation has …
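To make the estimators named above concrete, here is a minimal sketch, under stated assumptions, of a Huber-type M-estimate computed by iteratively reweighted least squares (IRLS), together with the LMS and LTS criteria. None of this code comes from the source: the function names, the tuning constant 1.345, the MAD-based scale estimate, the coverage `h`, and the synthetic data are illustrative choices, and IRLS is only one common way to compute an M-estimate.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber-type M-estimate via iteratively reweighted least squares (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the least squares fit
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale: normalized median absolute deviation of the residuals.
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small standardized residuals, c/|u| for large ones.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def lms_objective(X, y, beta):
    """Least median of squares criterion: median of the squared residuals."""
    return np.median((y - X @ beta) ** 2)

def lts_objective(X, y, beta, h):
    """Least trimmed squares criterion: mean of the h smallest squared residuals."""
    r2 = np.sort((y - X @ beta) ** 2)
    return r2[:h].mean()

# Hypothetical data: y = 1 + 2x with small noise and three gross outliers in y.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)
y[[3, 12, 21]] += 50.0
X = np.column_stack([np.ones_like(x), x])

beta_true = np.array([1.0, 2.0])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hub = huber_irls(X, y)
h = 20  # number of observations retained by LTS (an assumption)

print("OLS  :", beta_ols)
print("Huber:", beta_hub)
print("LMS criterion at OLS / true:",
      lms_objective(X, y, beta_ols), lms_objective(X, y, beta_true))
print("LTS criterion at OLS / true:",
      lts_objective(X, y, beta_ols, h), lts_objective(X, y, beta_true, h))
```

The sketch only evaluates the LMS and LTS criteria at given coefficient vectors; actually minimizing them requires a search over subsets or candidate fits, which is omitted here. Note also that the Huber weights depend only on the residuals, so this form of M-estimation does not by itself bound the influence of high-leverage points.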
Related resources
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
Robust Estimation in Linear Regression Model: the Density Power Divergence Approach
The minimum density power divergence method provides a robust estimate when the dataset includes a number of outliers. In this study, we introduce and use a robust minimum density power divergence estimator to estimate the parameters of the linear regression model, and then, with some numerical examples of the linear regression model, we show the robustness of this est...
A robust least squares fuzzy regression model based on kernel function
In this paper, a new approach is presented to fit a robust fuzzy regression model based on some fuzzy quantities. In this approach, we first introduce a new distance between two fuzzy numbers using the kernel function, and then, based on the least squares method, the parameters of the fuzzy regression model are estimated. The proposed approach has a suitable performance to...
Two Robust Fuzzy Regression Models and Their Applications in Predicting Imperfections of Cotton Yarn
Simultaneous robust estimation of multi-response surfaces in the presence of outliers
A robust approach should be considered when estimating regression coefficients in multi-response problems. Many models are derived from the least squares method. Because the presence of outlier data is unavoidable in most real cases and because the least squares method is sensitive to these types of points, robust regression approaches appear to be a more reliable and suitable method for addres...
Fuzzy Robust Regression Analysis with Fuzzy Response Variable and Fuzzy Parameters Based on the Ranking of Fuzzy Sets
Robust regression is an appropriate alternative to ordinal regression when outliers exist in a given data set. If we have fuzzy observations, ordinal regression methods cannot model them; in this case, fuzzy regression is a good method. When observations are fuzzy and there are outliers in the data set, robust fuzzy regression methods are appropriate alternatives....